

Search for: All records

Creators/Authors contains: "Poelen, Jorrit"


  1. Introduction
     This archive includes a tab-delimited (tsv) and comma-delimited (csv) version of the Discover Life bee species guide and world checklist (Hymenoptera: Apoidea: Anthophila). Discover Life is an important resource for bee species names, and this update is from Draft-55, November 2020. Data were accessed and transformed into a tsv file in August 2023 using the Global Biotic Interactions (GloBI) nomer software. GloBI now incorporates the Discover Life bee species guide and world checklist in its functionality for searching for bee interactions.

     Update! New Dataset also includes Subgenera Names
     A new, tab-delimited version of the Discover Life taxonomy as derived from Dorey et al., 2023 can be found via Zenodo at https://doi.org/10.5281/zenodo.10463762. This version of the Discover Life world species guide and checklist includes subgeneric names.

     Citation
     Please cite the original source for this data as: Ascher, J. S. and J. Pickering. 2022. Discover Life bee species guide and world checklist (Hymenoptera: Apoidea: Anthophila). http://www.discoverlife.org/mp/20q?guide=Apoidea_species Draft-56, 21 August, 2022

     nomer
     nomer is a command-line application for working with taxonomic resources offline. nomer incorporates many current taxonomic catalogs (e.g., Catalogue of Life, ITIS, EOL, NCBI) and provides simple tools for comparing resources or resolving taxonomic names against one or more taxonomic name catalogs. Discover Life is included in nomer version 0.5.1, and the full dataset can be recreated by installing nomer from https://github.com/globalbioticinteractions/nomer and running
       $ nomer list discoverlife > discoverlife.tsv

     Data Columns
     Discover Life provides a world name checklist and includes other names (synonyms and homonyms) that refer to the same species. In the tsv file, the provided name is either the accepted (checklist) name or an "other name." All names are listed as a providedName. The columns in the transformed version of the data are:
       providedExternalId = link to name on Discover Life
       providedName = an accepted or "other name" in the Discover Life bee checklist. "Other names" can be synonyms or homonyms.
       providedAuthorship = authorship for the providedName
       providedRank = rank of the providedName
       providedPath = higher taxonomy of the providedName. This will be the same as that of the accepted name, or resolvedName.
       relationName = relationship between the "other name" and the bee name in the Discover Life checklist. A name may be related to itself.
       resolvedName = an accepted name in the Discover Life bee checklist
       resolvedExternalId = link to name on Discover Life
       resolvedAuthorship = authorship of the accepted, or checklist, name
       resolvedRank = rank of the accepted, or checklist, name
       resolvedPath = higher taxonomy of the accepted, or checklist, name

     Changes
     No major changes to format in this version.

     References
     Jorrit Poelen, & José Augusto Salim. (2022). globalbioticinteractions/nomer: (0.2.11). Zenodo. https://doi.org/10.5281/zenodo.6128011
     Poelen JH, Simons JD, Mungall CJ (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005
     Seltmann KC, Allen J, Brown BV, Carper A, Engel MS, Franz N, Gilbert E, Grinter C, Gonzalez VH, Horsley P, Lee S, Maier C, Miko I, Morris P, Oboyski P, Pierce NE, Poelen J, Scott VL, Smith M, Talamas EJ, Tsutsui ND, Tucker E (2021) Announcing Big-Bee: An initiative to promote understanding of bees through image and trait digitization. Biodiversity Information Science and Standards 5: e74037. https://doi.org/10.3897/biss.5.74037
     Dorey, J.B., Fischer, E.E., Chesshire, P.R. et al. A globally synthesised and flagged bee occurrence dataset and cleaning workflow. Sci Data 10, 747 (2023). https://doi.org/10.1038/s41597-023-02626-w
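A minimal sketch of how the column layout described in item 1 might be used: it groups every "other name" under its accepted name. Several things here are assumptions rather than facts from the record: that the tsv has no header row (so column names are passed explicitly, in the order listed above), the relationName values shown, and the sample rows themselves, which are made-up placeholders and not real Discover Life data.

```python
import csv
import io
from collections import defaultdict

# Column order as listed in the record description; the absence of a
# header row in the real file is an assumption.
COLUMNS = [
    "providedExternalId", "providedName", "providedAuthorship",
    "providedRank", "providedPath", "relationName",
    "resolvedName", "resolvedExternalId", "resolvedAuthorship",
    "resolvedRank", "resolvedPath",
]


def other_names_by_accepted(tsv_text):
    """Map each accepted (resolved) name to its sorted list of other names."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        fieldnames=COLUMNS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as plain text
    )
    grouped = defaultdict(set)
    for row in reader:
        # Rows where a name is related to itself carry no "other name".
        if row["providedName"] != row["resolvedName"]:
            grouped[row["resolvedName"]].add(row["providedName"])
    return {name: sorted(others) for name, others in grouped.items()}


# Hypothetical placeholder rows, for illustration only.
SAMPLE = "\n".join("\t".join(fields) for fields in [
    ["https://example.org/name/1", "Genus alpha", "Author, 1900", "species",
     "Family | Genus | Genus alpha", "HAS_ACCEPTED_NAME", "Genus alpha",
     "https://example.org/name/1", "Author, 1900", "species",
     "Family | Genus | Genus alpha"],
    ["https://example.org/name/2", "Genus beta", "Other, 1910", "species",
     "Family | Genus | Genus alpha", "SYNONYM_OF", "Genus alpha",
     "https://example.org/name/1", "Author, 1900", "species",
     "Family | Genus | Genus alpha"],
])

if __name__ == "__main__":
    print(other_names_by_accepted(SAMPLE))
```

To run this against a real export, the file produced by the nomer command would be read and passed in as text; the column order should be checked against the actual file first.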
  2. Last modified: July 3, 2024

     Introduction
     This dataset comprises all bee interactions indexed by Global Biotic Interactions (GloBI; Poelen et al. 2014). It is published quarterly by the Big Bee Project (Seltmann et al. 2021) to summarize all available knowledge about bee interactions from natural history collections, community science observations (i.e., iNaturalist), and the literature. Interactions include flower visitation, parasitic interactions (mite, viral), lecty, and many others.

     Data Description
     Please see the integration process page (https://www.globalbioticinteractions.org/process) to better understand how Global Biotic Interactions combines datasets from various sources. The complete interaction dataset for all species can be accessed via https://www.globalbioticinteractions.org/data. Data are filtered for unique records based on the interaction description and source citation. Archives contain full data records and unique filtered records in tab-delimited format. Dataset column name definitions are available at https://api.globalbioticinteractions.org/interactionFields.

     Duplicate records occur in the database when more than one provider shares the same information. This happens most frequently in museum specimen data, and duplicates can be identified by evaluating the institutionCode, collectionCode, and catalogNumber fields. The file catalogNumber_counts.tsv groups records by these three fields for this dataset, but does not filter out duplicate records. Additionally, this dataset includes the citation information provided by the data publisher. The provided sourceCitation may not include information about the primary provider (often the natural history collection) from which the specimen data originate, so the catalogNumber should be referenced to understand the original source of the data. Summary statistics about the dataset can be found in the bees-only-review.pdf file.
     This review of all bee data indexed by Global Biotic Interactions was created using GloBI’s Interaction Data Review Report Framework via repository https://github.com/Big-Bee-Network/select-bee-interactions.sh.

     Metrics

     Date        Total bee records
     07-17-2020  232,906
     01-24-2021  257,738
     11-17-2021  226,160
     06-01-2022  286,818
     11-07-2022  429,308
     01-18-2024  842,819
     07-03-2024  1,109,057

     Date        Andrenidae  Apidae   Colletidae  Halictidae
     07-17-2020  73,463      106,222  20,821      58,880
     01-24-2021  77,824      120,919  21,376      63,945
     11-17-2021  25,535      134,517  10,568      43,070
     06-01-2022  78,016      144,827  20,409      64,054
     11-07-2022  84,172      171,378  30,792      79,155
     01-18-2024  166,473     334,224  63,847      171,931
     07-03-2024  289,400     371,953  83,337      190,562

     Date        Megachilidae  Melittidae  Stenotritidae
     07-17-2020  44,449        2,511       23
     01-24-2021  48,856        2,624       18
     11-17-2021  37,001        995         9
     06-01-2022  54,516        2,994       18
     11-07-2022  61,391        2,396       24
     01-18-2024  100,814       5,088       442
     07-03-2024  162,587       4,964       438

     Included Resources

     count  sourceCitation
     219440  Symbiota Collections of Arthropods Network (SCAN)
     156437  University of Kansas Natural History Museum
     150780  Digital Bee Collections Network, 2014 (and updates). Version: 2015-03-18. National Science Foundation grant DBI#0956388
     134657  USGS Biodiversity Information Serving Our Nation (BISON) IPT
     126820  http://iNaturalist.org is a place where you can record what you see in nature, meet other nature lovers, and learn about the natural world.
     44522  PaDIL Bee records from the Pests and Diseases Image Library, http://www.padil.gov.au.
     38658  University of Michigan Museum of Zoology Insect Division. Full Database Export 2020-11-20 provided by Erika Tucker and Barry Oconner.
     27711  Carril OM, Griswold T, Haefner J, Wilson JS. (2018) Wild bees of Grand Staircase-Escalante National Monument: richness, abundance, and spatio-temporal beta-diversity. PeerJ 6:e5867 https://doi.org/10.7717/peerj.5867
     15506  Seltmann, K., Van Wagner, J., Behm, R., Brown, Z., Tan, E., & Liu, K. (2020). BID: A project to share biotic interaction and ecological trait data about bees (Hymenoptera: Anthophila). UC Santa Barbara: Cheadle Center for Biodiversity and Ecological Restoration. Retrieved from https://escholarship.org/uc/item/1g21k7bf
     14666  Web of Life. http://www.web-of-life.es .
     14577  Pensoft Darwin Core Archives available via Integrated Publication Toolkit
     13447  University of Colorado Museum of Natural History Entomology Collection
     13296  https://mangal.io - the ecological interaction database.
     10705  National Database Plant Pollinators. Center for Plant Conservation at San Diego Zoo Global. Accessed via https://saveplants.org/national-collection/pollinator-search/ on 2020-06-05.
     8529  Ollerton, J., Trunschke, J. ., Havens, K. ., Landaverde-González, P. ., Keller, A. ., Gilpin, A.-M. ., Rodrigo Rech, A. ., Baronio, G. J. ., Phillips, B. J., Mackin, C. ., Stanley, D. A., Treanore, E. ., Baker, E. ., Rotheray, E. L., Erickson, E. ., Fornoff, F. ., Brearley, F. Q. ., Ballantyne, G. ., Iossa, G. ., Stone, G. N., Bartomeus, I. ., Stockan, J. A., Leguizamón, J., Prendergast, K. ., Rowley, L., Giovanetti, M., de Oliveira Bueno, R., Wesselingh, R. A., Mallinger, R., Edmondson, S., Howard, S. R., Leonhardt, S. D., Rojas-Nossa, S. V., Brett, M., Joaqui, T., Antoniazzi, R., Burton, V. J., Feng, H.-H., Tian, Z.-X., Xu, Q., Zhang, C., Shi, C.-L., Huang, S.-Q., Cole, L. J., Bendifallah, L., Ellis, E. E., Hegland, S. J., Straffon Díaz, S., Lander, T. A. ., Mayr, A. V., Dawson, R. ., Eeraerts, M. ., Armbruster, W. S. ., Walton, B. ., Adjlane, N. ., Falk, S. ., Mata, L. ., Goncalves Geiger, A. ., Carvell, C. ., Wallace, C. ., Ratto, F. ., Barberis, M. ., Kahane, F. ., Connop, S. ., Stip, A. ., Sigrist, M. R. ., Vereecken, N. J. ., Klein, A.-M., Baldock, K. ., & Arnold, S. E. J. . (2022). Pollinator-flower interactions in gardens during the COVID-19 pandemic lockdown of 2020. Journal of Pollination Ecology, 31, 87–96. https://doi.org/10.26786/1920-7603(2022)695
     8014  Redhead, J.W.; Coombes, C.F.; Dean, H.J.; Dyer, R.; Oliver, T.H.; Pocock, M.J.O.; Rorke, S.L.; Vanbergen, A.J.; Woodcock, B.A.; Pywell, R.F. (2018). Plant-pollinator interactions database for construction of potential networks. NERC Environmental Information Data Centre. https://doi.org/10.5285/6d8d5cb5-bd54-4da7-903a-15bd4bbd531b
     7630  CaraDonna, P.J. 2020. Temporal variation in plant-pollinator interactions, Rocky Mountain Biological Laboratory, CO, USA, 2013 - 2015 ver 1. Environmental Data Initiative. https://doi.org/10.6073/pasta/27dc02fe1655e3896f20326fed5cb95f (Accessed 2021-04-16).
     6921  Purdue Entomological Research Collection
     6911  Arizona State University Hasbrouck Insect Collection
     6430  LaManna, JA, Burkle, LA, Belote, RT, Myers, JA. Biotic and abiotic drivers of plant–pollinator community assembly across wildfire gradients. J Ecol. 2020; 00: 1– 14. https://doi.org/10.1111/1365-2745.13530 .
     6288  Pensoft Darwin Core Archives with associateTaxa columns
     6269  Eardley C, Coetzer W. 2016. Catalogue of Afrotropical Bees.
     6114  University of Michigan Museum of Zoology, Division of Insects
     5089  Magrach, Ainhoa et al. (2017), Data from: Plant-pollinator networks in semi-natural grasslands are resistant to the loss of pollinators during blooming of mass-flowering crops, Dryad, Dataset, https://doi.org/10.5061/dryad.k0q1n
     3860  Giselle Muschett & Francisco E. Fontúrbel. 2021. A comprehensive catalogue of plant – pollinator interactions for Chile
     3720  Frost Entomological Museum, Pennsylvania State University
     3670  Natural History Collections managed by Arctos (https://arctosdb.org) accessed via https://vertnet.org .
     3620  Sarah E Miller. 6/19/2015. Species associations manually extracted from datasets https://www.nceas.ucsb.edu/interactionweb/resources.html.
     3581  Robert L. Minckley San Bernardino Valley from the year 2000 to 2011.
     3581  University of New Hampshire Collection of Insects and other Arthropods UNHC-UNHC
     3581  University of New Hampshire Donald S. Chandler Entomological Collection
     2242  Sarah E. Miller. 07/06/2017. Information extracted from dataset https://www.idigbio.org/portal/recordsets/db4bb0df-8539-4617-ab5f-eb118aa3126b.
     2223  Bartomeus, Ignasi (2013): Plant-Pollinator Network Data. figshare. Dataset. https://doi.org/10.6084/m9.figshare.154863.v1
     2110  Illinois Natural History Survey Insect Collection
     2074  Florida State Collection of Arthropods
     2035  Ed Baker; Ian J. Kitching; George W. Beccaloni; Amoret Whitaker et al. (2016). Dataset: NHM Interactions Bank. Natural History Museum Data Portal (data.nhm.ac.uk). https://doi.org/10.5519/0060767
     1762  Poelen, Jorrit H. (2023). A biodiversity dataset graph: Biological Associations in TaxonWorks hash://sha256/a4d651aac5220487835e6178511886e98b845b2d98cb7c5447fb2b042e0654d2 hash://md5/849edbe55e31e54ea5cdaba0188c5655 (0.2) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8253729
     1681  Harvard University M, Morris P J (2021). Museum of Comparative Zoology, Harvard University. Museum of Comparative Zoology, Harvard University.
     1563  Ballantyne, Gavin; Baldock, Katherine C. R.; Willmer, Pat G. (2015), Data from: Constructing more informative plant-pollinator networks: visitation and pollen deposition networks in a heathland plant community, Dryad, Dataset, https://doi.org/10.5061/dryad.17pp3
     1365  Sarah E Miller. 5/30/2016. Interations from various papers.
     1281  Sarah E Miller. 4/18/2016. Species associations from Wardeh, M. et al. Database of host-pathogen and related species interactions, and their global distribution. Sci. Data 2:150049 doi: 10.1038/sdata.2015.49 (2015)
     1102  University of California Santa Barbara Invertebrate Zoology Collection
     1086  Cohen JM, Sauer EL, Santiago O, Spencer S, Rohr JR. 2020. Divergent impacts of warming weather on wildlife disease risk across climates. Science. doi:10.1126/science.abb1702
     939  Allen Hurlbert. 2017. Avian Diet Database.
     918  Texas A&M University Insect Collection
     906  Del Risco, A.A., Montoya, Á.M., García, V. et al. Data synthesis and dynamic visualization converge into a comprehensive biotic interaction network: a case study of the urban and rural areas of Bogotá D.C.. Urban Ecosyst (2021). https://doi.org/10.1007/s11252-021-01133-3
     872  Cristina Preda and Quentin Groom. 2014. Species associations manually extracted from literature.
     754  United States Geological Survey (USGS) Pollinator Library. https://www.npwrc.usgs.gov/pollinator.
     752  Sarah E Miller. 6/22/2015. Species associations manually extracted from datasets https://www.nceas.ucsb.edu/interactionweb/resources.html.
     750  RCPol: Online Pollen Catalogs Network. 2016. https://rcpol.org.br/
     744  Classen, Alice; Steffan-Dewenter, Ingolf (2020): Plant-pollinator interactions along an elevational gradient on Mt. Kilimanjaro. PANGAEA, https://doi.org/10.1594/PANGAEA.911390
     704  Yale University Peabody Museum Collections Data Portal
     677  The Albert J. Cook Arthropod Research Collection
     541  Udy, Kristy; Reininghaus, Hannah; Scherber, Christoph; Tscharntke, Teja (2020), Data from: Plant-pollinator interactions along an urbanization gradient from cities and villages to farmland landscapes, Dryad, Dataset, https://doi.org/10.5061/dryad.4mw6m906s
     524  Pardee, G.L., Ballare, K.M., Neff, J.L., Do, L.Q., Ojeda, D., Bienenstock, E.J., Brosi, B.J., Grubesic, T.H., Miller, J.A., Tong, D. and Jha, S., 2023. Local and Landscape Factors Influence Plant-Pollinator Networks and Bee Foraging Behavior across an Urban Corridor. Land, 12(2), p.362. https://www.mdpi.com/2073-445X/12/2/362
     511  Sarah E Miller. 6/25/2015. Species associations manually extracted from Robertson, C. 1929. Flowers and insects: lists of visitors to four hundred and fifty-three flowers. Carlinville, IL, USA, C. Robertson.
     511  The International Barcode of Life Consortium (2016). International Barcode of Life project (iBOL). Occurrence dataset https://doi.org/10.15468/inygc6
     454  Seltzer, Carrie; Wysocki, William; Palacios, Melissa; Eickhoff, Anna; Pilla, Hannah; Aungst, Jordan; Mercer, Aaron; Quicho, Jamie; Voss, Neil; Xu, Man; J. Ndangalasi, Henry; C. Lovett, Jon; J. Cordeiro, Norbert (2015): Plant-animal interactions from Africa. figshare. https://dx.doi.org/10.6084/m9.figshare.1526128
     342  Mycology Collections Data Portal (MyCoPortal). 2020. https://mycoportal.org
     292  Global Web Database (http://globalwebdb.com): an online collection of food webs. Accessed via https://www.globalwebdb.com/Service/DownloadArchive on 2017-10-12.
     268  University of Wisconsin Stevens Point, Stephen J. Taft Parasitological Collection
     241  University of Hawaii Insect Museum
     168  Sarah E Miller. 12/13/2016. Species associations manually extracted from Onstad, D.W. EDWIP: Ecological Database of the World's Insect Pathogens. Champaign, Illinois: Illinois Natural History Survey, [23/11/2016]. http://insectweb.inhs.uiuc.edu/Pathogens/EDWIP.
     153  California Academy of Sciences Entomology and Entomology Type Collection
     127  Olito, Colin; Fox, Jeremy W. (2015), Data from: Species traits and abundances predict metrics of plant–pollinator network structure, but not pairwise interactions, Dryad, Dataset, https://doi.org/10.5061/dryad.7st32
     114  Kari Lintulaakso. 2023. MammalBase Diet Database.
     106  Brose, U. (2018). GlobAL daTabasE of traits and food Web Architecture (GATEWAy) version 1.0 [Data set]. German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig. https://doi.org/10.25829/IDIV.283-3-756
     104  Groom, Q.J., Maarten De Groot, M. & Marčiulynienė, D. (2020) Species interation data manually extracted from literature for species .
     96  Eneida L. Hatcher, Sergey A. Zhdanov, Yiming Bao, Olga Blinkova, Eric P. Nawrocki, Yuri Ostapchuck, Alejandro A. Schäffer, J. Rodney Brister, Virus Variation Resource – improved response to emergent viral outbreaks, Nucleic Acids Research, Volume 45, Issue D1, January 2017, Pages D482–D490, https://doi.org/10.1093/nar/gkw1065 .
     93  Jakovos Demetriou and Quentin Groom 2014. Species associations of Sceliphron manually extracted from literature.
     92  San Diego Natural History Museum
     80  Price Institute of Parasite Research, School of Biological Sciences, University of Utah
     59  National Museum of Natural History, Smithsonian Institution IPT RSS Feed
     56  Poelen, JH (2016). Plant pathogen-host interactions scraped from Common Names of Plant Diseases published by the American Phytopathological Society at http://www.apsnet.org/publications/commonnames/Pages/default.aspx using Samara, a Planteome (http://planteome.org) plant-trait scraper.
     50  Florez-Montero, G.L., Muylaert, R.L., Nogueira, M.R., Geiselman, C., Santana, S.E., Stevens, R.D., Tschapka, M., Rodrigues, F.A. and Mello, M.A.R. (2022), NeoBat Interactions: A data set of bat–plant interactions in the Neotropics. Ecology. Accepted Author Manuscript e3640. https://doi.org/10.1002/ecy.3640
     50  Ferrer-Paris, José R.; Sánchez-Mercado, Ada Y.; Lozano, Cecilia; Zambrano, Liset; Soto, José; Baettig, Jessica; Leal, María (2014): A compilation of larval host-plant records for six families of butterflies (Lepidoptera: Papilionoidea) from available electronic resources. figshare. http://dx.doi.org/10.6084/m9.figshare.1168861
     39  Pocock, Michael J. O.; Evans, Darren M.; Memmott, Jane (2012), Data from: The robustness and restoration of a network of ecological networks, Dryad, Dataset, https://doi.org/10.5061/dryad.3s36r118
     37  Sarah E Miller. 9/19/2016. Species associations extracted from Graystock, P., Blane, E.J., McFrederick, Q.S., Goulson, D. and Hughes, W.O., 2016. Do managed bees drive parasite spread and emergence in wild bees?. International Journal for Parasitology: Parasites and Wildlife, 5(1), pp.64-75.
     36  Mihara, T., Nishimura, Y., Shimizu, Y., Nishiyama, H., Yoshikawa, G., Uehara, H., Hingamp, P., Goto, S., and Ogata, H.; Linking virus genomes with host taxonomy. Viruses 8, 66 doi:10.3390/v8030066 (2016).
     36  Quentin J. Groom. 2020. Species interactions of species on the List of invasive alien species of Union concern
     33  IPBES. (2016). The assessment report of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services on pollinators, pollination and food production. Table 2.4.3 p88 Zenodo. https://doi.org/10.5281/zenodo.3402857
     30  Brigham Young University Arthropod Museum
     24  Geiselman, Cullen K. & Sarah Younger. 2020. Bat Eco-Interactions Database. www.batbase.org
     24  Geiselman, Cullen K. and Tuli I. Defex. 2015. Bat Eco-Interactions Database. www.batplant.org
     23  Agosti, Donat. 2020. Transcription of Linné, C. von, 1758. Systema naturae per regna tria naturae secundum classes, ordines, genera, species, cum characteribus, differentiis, synonymis, locis. Available at: http://dx.doi.org/10.5962/bhl.title.542 .
     21  Species Connect. https://speciesconnect.com
     17  http://invertebrates.si.edu/parasites.htm
     14  Gandhi, K. J. K., & Herms, D. A. (2009). North American arthropods at risk due to widespread Fraxinus mortality caused by the Alien Emerald ash borer. Biological Invasions, 12(6), 1839–1846. doi:10.1007/s10530-009-9594-1.
     12  Food Webs and Species Interactions in the Biodiversity of UK and Ireland (Online). 2017. Data provided by Malcolm Storey. Also available from http://bioinfo.org.uk.
     12  Sarah E Miller. 5/28/2015. Arnaud, Paul Henri. A Host-parasite Catalog of North American Tachinidae (Diptera). Washington, D.C.: U.S. Dept. of Agriculture, Science and Education Administration, 1978.
     10  University of California Santa Barbara Herbarium
     9  Field Museum of Natural History IPT
     8  Brose, U. et al., 2005. Body sizes of consumers and their resources. Ecology, 86(9), pp.2545–2545. Available at: http://dx.doi.org/10.1890/05-0379.
     8  Strong, Justin S., and Shawn J. Leroux. 2014. "Impact of Non-Native Terrestrial Mammals on the Structure of the Terrestrial Mammal Food Web of Newfoundland, Canada." PLOS ONE 9 (8): e106264. https://doi.org/10.1371/journal.pone.0106264
     7  Chen L, Liu B, Wu Z, Jin Q, Yang J, 2017. DRodVir: A resource for exploring the virome diversity in rodents. J Genet Genomics. 44(5):259-264.
     5  Froese, R. and D. Pauly. Editors. 2018. FishBase. World Wide Web electronic publication. www.fishbase.org, version (10/2018).
     5  Pinnegar, J.K. (2014). DAPSTOM - An Integrated Database & Portal for Fish Stomach Records. Version 4.7. Centre for Environment, Fisheries & Aquaculture Science, Lowestoft, UK. February 2014, 39pp.
     4  Aja Sherman, Cullen Geiselman. 2021. Bat Co-Roosting Database
     4  Bernice Pauahi Bishop Museum, J. Linsley Gressitt Center for Research in Entomology
     4  Mollentze, Nardus, & Streicker, Daniel G. (2019). Viral zoonotic risk is homogenous among taxonomic orders of mammalian and avian reservoir hosts (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3516613
     4  Sarah E Miller. 7/7/2016. Text gathered from Wirta, H.K., Vesterinen, E.J., Hambäck, P.A., Weingartner, E., Rasmussen, C., Reneerkens, J., Schmidt, N.M., Gilg, O. and Roslin, T., 2015. Exposing the structure of an Arctic food web. Ecology and evolution, 5(17), pp.3842-3856.
     4  Sarah E Miller. 9/15/2016. Species associations extracted from http://parasiticplants.siu.edu/index.html.
     4  Sarah E. Miller. 04/14/2015. Extracted from literature Scott, J.A. 1986. The Butterflies of North America. Stanford University Press, Stanford, CA
     4  Scott L. Gardner and Gabor R. Racz (2021). University of Nebraska State Museum - Parasitology. Harold W. Manter Laboratory of Parasitology. University of Nebraska State Museum.
     2  Deans, Andrew (2021). Catalog of Rose Gall, Herb Gall, and Inquiline Gall Wasps (Hymenoptera: Cynipidae) of the United States, Canada, and Mexico
     2  Jorrit H. Poelen. 2017. Species interactions associated with known species interaction datasets.
     2  Museum for Southwestern Biology (MSB) Parasite Collection
     2  Sarah E Miller. 4/20/2015. Species associations manually extracted from various papers and articles from site https://repository.si.edu
     2  Seltmann, Katja C. 2020. Biotic species interactions about ticks manually extracted from literature.
     2  Species Interactions of Australia Database (SIAD): Helping us to understand species interactions in Australia and beyond. http://www.discoverlife.org/siad/ .
     1  Chen L, Liu B, Yang J, Jin Q, 2014. DBatVir: the database of bat-associated viruses. Database (Oxford). 2014:bau021. doi:10.1093/database/bau021
     1  Grundler MC (2020) SquamataBase: a natural history database and R package for comparative biology of snake feeding habits. Biodiversity Data Journal 8: e49943. https://doi.org/10.3897/BDJ.8.e49943
     1  Gunther KA et al. 2014 Dietary breadth of grizzly bears in the Greater Yellowstone Ecosystem. Ursus 25(1):60-72
     1  Sarah E Miller. 7/6/2016. Arctos collection.

     Included files
     bee_data_BID.sh - script for separating bee records by family
     uniq_citations.tsv - list of unique citations indicating bee interactions
     Andrenidae_data_unique.tsv - Andrenidae records
     Apidae_data_unique.tsv - Apidae records
     Colletidae_data_unique.tsv - Colletidae records
     Halictidae_data_unique.tsv - Halictidae records
     Megachilidae_data_unique.tsv - Megachilidae records
     Melittidae_data_unique.tsv - Melittidae records
     Stenotritidae_data_unique.tsv - Stenotritidae records
     bees-only-interactions.tsv.zip - list of all bee interaction data indexed on Global Biotic Interactions from GloBI version 2024-06-07 produced by https://github.com/Big-Bee-Network/select-bee-interactions.sh
     bees-only-review.pdf - review of all bee data indexed by Global Biotic Interactions using GloBI’s Interaction Data Review Report Framework via repository https://github.com/Big-Bee-Network/select-bee-interactions.sh
     catalogNumber_counts.tsv - counts by catalogNumber in dataset. Duplicate catalog numbers indicate duplicated data shared by multiple data providers.

     References
     GloBI Community. (2024). Global Biotic Interactions: Interpreted Data Products hash://md5/946f7666667d60657dc89d9af8ffb909 hash://sha256/4e83d2daee05a4fa91819d58259ee58ffc5a29ec37aa7e84fd5ffbb2f92aa5b8 (0.7) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.11552565
     Poelen JH, Simons JD, Mungall CJ (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005
     Seltmann KC, Allen J, Brown BV, Carper A, Engel MS, Franz N, Gilbert E, Grinter C, Gonzalez VH, Horsley P, Lee S, Maier C, Miko I, Morris P, Oboyski P, Pierce NE, Poelen J, Scott VL, Smith M, Talamas EJ, Tsutsui ND, Tucker E (2021) Announcing Big-Bee: An initiative to promote understanding of bees through image and trait digitization. Biodiversity Information Science and Standards 5: e74037. https://doi.org/10.3897/biss.5.74037
     Poelen, JS & Seltmann, KS (2024) Bees Only Please: Selecting Hundreds of Thousands of Possible Bee Interactions Using a Laptop, Open Datasets, and Small (but Mighty) Commandline Tools. https://www.globalbioticinteractions.org/2024/06/07/bees-only-please
     Ascher, J. S. and J. Pickering (2020) Discover Life bee species guide and world checklist (Hymenoptera: Apoidea: Anthophila). http://www.discoverlife.org/mp/20q?guide=Apoidea_species

     Acknowledgements
     This project is supported by the National Science Foundation. Award numbers: DBI:2102006, DBI:2101929, DBI:2101908, DBI:2101876, DBI:2101875, DBI:2101851, DBI:2101345, DBI:2101913, DBI:2101891 and DBI:2101850
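The duplicate check described in item 2's Data Description (grouping records by institutionCode, collectionCode and catalogNumber) could be sketched roughly as below. This is an illustration under assumptions, not the script that produced catalogNumber_counts.tsv: the header names are taken from the prose and may differ in the actual GloBI export, and the sample rows are invented placeholders.

```python
import csv
import io
from collections import Counter

# Field names as written in the description; the real column headers in
# the tsv files are an assumption and should be checked before use.
KEY_FIELDS = ("institutionCode", "collectionCode", "catalogNumber")


def catalog_number_counts(tsv_text):
    """Count records per (institutionCode, collectionCode, catalogNumber)."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    counts = Counter()
    for row in reader:
        key = tuple((row.get(field) or "").strip() for field in KEY_FIELDS)
        if all(key):  # skip records missing any of the three identifiers
            counts[key] += 1
    return counts


def likely_duplicates(counts):
    """Keep only keys that occur more than once."""
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical placeholder records, for illustration only.
SAMPLE = "\n".join("\t".join(fields) for fields in [
    ["institutionCode", "collectionCode", "catalogNumber", "sourceCitation"],
    ["INST-A", "COLL-1", "0001", "Provider one"],
    ["INST-A", "COLL-1", "0001", "Provider two"],
    ["INST-B", "COLL-9", "0042", "Provider one"],
    ["", "", "", "Literature record without specimen identifiers"],
])

if __name__ == "__main__":
    print(likely_duplicates(catalog_number_counts(SAMPLE)))
```

Counting rather than dropping matches the description above: the counts file flags candidate duplicates but leaves the filtering decision to the user.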
  3. Abstract Commonly used data citation practices rely on unverifiable retrieval methods which are susceptible to content drift, which occurs when the data associated with an identifier have been allowed to change. Based on our earlier work on reliable dataset identifiers, we propose signed citations, i.e., customary data citations extended to also include a standards-based, verifiable, unique, and fixed-length digital content signature. We show that content signatures enable independent verification of the cited content and can improve the persistence of the citation. Because content signatures are location- and storage-medium-agnostic, cited data can be copied to new locations to ensure their persistence across current and future storage media and data networks. As a result, content signatures can be leveraged to help scalably store, locate, access, and independently verify content across new and existing data infrastructures. Content signatures can also be embedded inside content to create robust, distributed knowledge graphs that can be cited using a single signed citation. We describe applications of signed citations to solve real-world data collection, identification, and citation challenges. 
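As a rough illustration of the content-signature idea in this abstract, the sketch below derives a fixed-length identifier from the bytes of a dataset and checks retrieved content against a cited identifier. It assumes SHA-256 and the `hash://sha256/...` notation that appears in citations elsewhere on this page; the function names are hypothetical and this is not the authors' implementation.

```python
import hashlib

PREFIX = "hash://sha256/"


def content_id(data: bytes) -> str:
    """Return a location-independent identifier derived only from the bytes."""
    return PREFIX + hashlib.sha256(data).hexdigest()


def verify(data: bytes, cited_id: str) -> bool:
    """Check that retrieved bytes are exactly the content that was cited."""
    return content_id(data) == cited_id


if __name__ == "__main__":
    original = b"taxon\tinteraction\ttaxon\n"
    cited = content_id(original)
    print(cited)
    print(verify(original, cited))            # same bytes, any location
    print(verify(original + b"\n", cited))    # drifted content is detected
```

Because the identifier depends only on the bytes, a copy fetched from any mirror or storage medium can be verified the same way, and any change to the content yields a different identifier of the same length.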
  4. Commonly used data citation practices rely on unverifiable retrieval methods which are susceptible to “content drift”, which occurs when the data associated with an identifier have been allowed to change. Based on our earlier work on reliable dataset identifiers, we propose signed citations, i.e., customary data citations extended to also include a standards-based, verifiable, unique, and fixed-length digital content signature. We show that content signatures enable independent verification of the cited content and can improve the persistence of the citation. Because content signatures are location- and storage-medium-agnostic, cited data can be copied to new locations to ensure their persistence across current and future storage media and data networks. As a result, content signatures can be leveraged to help scalably store, locate, access, and independently verify content across new and existing data infrastructures. Content signatures can also be embedded inside content to create robust, distributed knowledge graphs that can be cited using a single signed citation. We describe real-world applications of signed citations used to cite and compile distributed data collections, cite specific versions of existing data networks, and stabilize associations between URLs and content. 
  5. The Global Biodiversity Information Facility (GBIF 2022a) has indexed more than 2 billion occurrence records from 70,147 datasets. These datasets often include "hidden" biotic interaction data because biodiversity communities use the Darwin Core standard (DwC, Wieczorek et al. 2012) in different ways to document biotic interactions. In this study, we extracted biotic interactions from GBIF data using an approach similar to that employed by Global Biotic Interactions (GloBI; Poelen et al. 2014) and summarized the results. Here we aim to present an estimate of the interaction data available in GBIF, showing that biotic interaction claims can be automatically found and extracted from GBIF. Our results suggest that much can be gained by an increased focus on the development of tools that help to index and curate biotic interaction data in existing datasets. Combined with data standardization and best practices for sharing biotic interactions, such as the initiative on plant-pollinator interactions (Salim 2022), this approach can rapidly contribute to and meet open data principles (Wilkinson 2016). We used Preston (Elliott et al. 2020), open-source software that versions biodiversity datasets, to copy all GBIF-indexed datasets. The biodiversity data graph version (Poelen 2020) of the GBIF-indexed datasets used during this study contains 58,504 datasets in Darwin Core Archive (DwC-A) format, totaling 574,715,196 records. After retrieval and verification, the datasets were processed using Elton. Elton extracts biotic interaction data and supports 20+ existing file formats, including various types of data elements in DwC records. Elton also helps align interaction claims (e.g., host of, parasite of, associated with) to the Relations Ontology (RO, Mungall 2022), making it easier to discover interaction data across a heterogeneous collection of datasets. 
Using a specific mapping from interaction claims found in the DwC records to terms in RO, Elton found 30,167,984 potential records (with non-empty values for the scanned DwC terms) and 15,248,478 records with recognized interaction types. Taxonomic name validation was performed using Nomer, which maps input names to names found in a variety of taxonomic catalogs. We only considered an interaction record valid where the interaction type could be mapped to a term in RO and where Nomer found a valid name for source and target taxa. Based on the workflow described in Fig. 1, we found 7,947,822 interaction records (52% of the potential interactions). Most of them were generic interactions (interacts_with, 87.5%), but the remaining 12.5% (993,477 records) included host-parasite and plant-animal interactions. The majority of the interaction records found involved plants (78%), animals (14%) and fungi (6%). In conclusion, there are many biotic interactions embedded in existing datasets registered in large biodiversity data indexers and aggregators like iDigBio, GBIF, and BioCASE. We exposed these biotic interaction claims using the combined functionality of biodiversity data tools Elton (for interaction data extraction), Preston (for reliable dataset tracking) and Nomer (for taxonomic name alignment). Nonetheless, the development of new vocabularies, standards and best practice guides would facilitate aggregation of interaction data, including the diversification of the GBIF data model (GBIF 2022b) for sharing biodiversity data beyond occurrence data. That is the aim of the TDWG Interest Group on Biological Interactions Data (TDWG 2022). 
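The reported shares can be re-derived from the counts quoted in the abstract. The short Python check below uses only those four numbers; the variable names are ours. It suggests that the 52% figure corresponds to the ratio of valid records to records with recognized interaction types.

```python
# Counts quoted in the abstract.
potential = 30_167_984   # records with non-empty values for the scanned DwC terms
recognized = 15_248_478  # records with recognized interaction types
valid = 7_947_822        # records with an RO term and valid source/target names
specific = 993_477       # valid records that are not generic interacts_with

print(f"valid / recognized = {valid / recognized:.1%}")
print(f"valid / potential  = {valid / potential:.1%}")
print(f"specific / valid   = {specific / valid:.1%}")
```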
  6.
    A wealth of information about how parasites interact with their hosts already exists in collections, scientific publications, specialized databases, and grey literature. The US National Science Foundation-funded Terrestrial Parasite Tracker Thematic Collection Network (TPT) project began in 2019 to help build a comprehensive picture of arthropod ectoparasites including the evolution of these parasite-host biotic associations, distributions, and the ecological interactions of disease vectors. TPT is a network of biodiversity collections whose data can assist scientists, educators, land managers, and policymakers to better understand the complex relationship between hosts and parasites including emergent properties that may explain the causes and frequency of human and wildlife pathogens. TPT member collections make their association information easier to access via Global Biotic Interactions (GloBI, Poelen et al. 2014), which is periodically archived through Zenodo to track progress in the TPT project. TPT leverages GloBI's ability to index biotic associations from specimen occurrence records that come from existing management systems (e.g., Arctos, Symbiota, EMu, Excel, MS Access) to avoid having to completely rework existing, or build new, cyber-infrastructures before collections can share data. TPT-affiliated collection managers use collection-specific translation tables to connect their verbatim (or original) terms used to describe associations (e.g., "ex", "found on", "host") to their interpreted, machine-readable terms in the OBO Relations Ontology (RO). These interpreted terms enable searches across previously siloed association record sets, while the original verbatim values remain accessible to help retain provenance and allow for interpretation improvements. TPT is an ambitious project, with the goal to database label data from over 1.2 million specimens of arthropod parasites of vertebrates coming from 22 collections across North America. 
In the first year of the project, the TPT collections created over 73,700 new records and 41,984 images. In addition, 17 TPT data providers and three other collaborators shared datasets that are now indexed by GloBI, visible on the TPT GloBI project page. These datasets came from collection specimen occurrence records and literature sources. Two TPT data archives that capture and preserve the changes in the data coming from TPT to GloBI were published through Zenodo (Poelen et al. 2020a, Poelen et al. 2020b). The archives document the changes in how data are shared by collections, including the biotic association data format and quantity of data captured. The Poelen et al. 2020b report included all TPT collections and biotic interactions from Arctos collections in VertNet and the Symbiota Collection of Arthropods Network (SCAN). The total number of interactions included in this report was 376,671 records (500,000 interactions is the overall goal for TPT). In addition, close coordination with TPT collection data managers, including many one-on-one conversations, a workshop, and a webinar (Sullivan et al. 2020), was conducted to help guide the data capture of biotic associations. GloBI is an effective tool to help integrate biotic association data coming from occurrence records into an openly accessible, global, linked view of existing species interaction records. The results gleaned from the TPT workshop and Zenodo data archives demonstrate that minimizing changes to existing workflows allows for custom interpretation of collection-specific interaction terms. In addition, including collection data managers in the development of the interaction term vocabularies is an important part of the process that may improve data sharing and the overall downstream data quality. 
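A collection-specific translation table of the kind described above can be sketched in a few lines of Python. The verbatim terms "ex", "found on", and "host" come from the abstract; the interpreted labels, the table contents, and all names in the code are illustrative assumptions, not the mappings actually used by any TPT collection or by GloBI.

```python
from typing import Dict, NamedTuple, Optional

class Association(NamedTuple):
    verbatim: str               # original term, kept for provenance
    interpreted: Optional[str]  # machine-readable label, or None if unmapped

# Hypothetical translation table for one collection (labels are illustrative).
TRANSLATION_TABLE: Dict[str, str] = {
    "ex": "has host",
    "found on": "interacts with",
    "host": "has host",
}

def interpret(verbatim: str, table: Dict[str, str] = TRANSLATION_TABLE) -> Association:
    """Look up a verbatim term while leaving the original value untouched."""
    key = verbatim.strip().lower()
    return Association(verbatim=verbatim, interpreted=table.get(key))

for term in ["Ex ", "found on", "on bark"]:
    print(interpret(term))
```

Keeping the verbatim value next to the interpreted one means an unmapped or mis-mapped term can be reinterpreted later by editing the table rather than the source records.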
  7. {"Abstract":["A biodiversity dataset graph: BHL<\/p>\n\nThe intended use of this archive is to facilitate (meta-)analysis of the Biodiversity Heritage Library (BHL). The Biodiversity Heritage Library improves research methodology by collaboratively making biodiversity literature openly available to the world as part of a global biodiversity community.<\/p>\n\nThis dataset provides versioned snapshots of the BHL network as tracked by Preston [2] between 2019-05-19 and 2020-05-09 using "preston update -u https://biodiversitylibrary.org".<\/p>\n\nThe archive consists of 256 individual parts (e.g., preston-00.tar.gz, preston-01.tar.gz, ...) to allow for parallel file downloads. The archive contains three types of files: index files, provenance logs and data files. In addition, index files have been individually included in this dataset publication to facilitate remote access. Index files provide a way to link provenance files in time to establish a versioning mechanism. Provenance files describe how, when, what and where the BHL content was retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543 .  <\/p>\n\nTo retrieve and verify the downloaded BHL biodiversity dataset graph, first concatenate all the downloaded preston-*.tar.gz files (e.g., cat preston-*.tar.gz > preston.tar.gz). Then, extract the archives into a "data" folder. 
Alternatively, you can use the preston[2] command-line tool to "clone" this dataset using:<\/p>\n\n$$ java -jar preston.jar clone --remote https://zenodo.org/record/3849560/files<\/p>\n\nAfter that, verify the index of the archive by reproducing the following provenance log history:<\/p>\n\n$$ java -jar preston.jar history\n<0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/89926f33157c0ef057b6de73f6c8be0060353887b47db251bfd28222f2fd801a> .\n<hash://sha256/41b19aa9456fc709de1d09d7a59c87253bc1f86b68289024b7320cef78b3e3a4> <http://purl.org/pav/previousVersion> <hash://sha256/89926f33157c0ef057b6de73f6c8be0060353887b47db251bfd28222f2fd801a> .\n<hash://sha256/7582d5ba23e0d498ca4f55c29408c477d0d92b4fdcea139e8666f4d78c78a525> <http://purl.org/pav/previousVersion> <hash://sha256/41b19aa9456fc709de1d09d7a59c87253bc1f86b68289024b7320cef78b3e3a4> .\n<hash://sha256/a70774061ccded1a45389b9e6063eb3abab3d42813aa812391f98594e7e26687> <http://purl.org/pav/previousVersion> <hash://sha256/7582d5ba23e0d498ca4f55c29408c477d0d92b4fdcea139e8666f4d78c78a525> .\n<hash://sha256/007e065ba4b99867751d688754aa3d33fa96e6e03133a2097e8a368d613cd93a> <http://purl.org/pav/previousVersion> <hash://sha256/a70774061ccded1a45389b9e6063eb3abab3d42813aa812391f98594e7e26687> .\n<hash://sha256/4fb4b4d8f1ae2961311fb0080e817adb2faa746e7eae15249a3772fbe2d662a1> <http://purl.org/pav/previousVersion> <hash://sha256/007e065ba4b99867751d688754aa3d33fa96e6e03133a2097e8a368d613cd93a> .\n<hash://sha256/67cc329e74fd669945f503917fbb942784915ab7810ddc41105a82ebe6af5482> <http://purl.org/pav/previousVersion> <hash://sha256/4fb4b4d8f1ae2961311fb0080e817adb2faa746e7eae15249a3772fbe2d662a1> .\n<hash://sha256/e46cd4b0d7fdb51ea789fa3c5f7b73591aca62d2d8f913346d71aa6cf0745c9f> <http://purl.org/pav/previousVersion> <hash://sha256/67cc329e74fd669945f503917fbb942784915ab7810ddc41105a82ebe6af5482> .\n<hash://sha256/9215d543418a80510e78d35a0cfd7939cc59f0143d81893ac455034b5e96150a> 
<http://purl.org/pav/previousVersion> <hash://sha256/e46cd4b0d7fdb51ea789fa3c5f7b73591aca62d2d8f913346d71aa6cf0745c9f> .\n<hash://sha256/1448656cc9f339b4911243d7c12f3ba5366b54fff3513640306682c50f13223d> <http://purl.org/pav/previousVersion> <hash://sha256/9215d543418a80510e78d35a0cfd7939cc59f0143d81893ac455034b5e96150a> .\n<hash://sha256/7ee6b16b7a5e9b364776427d740332d8552adf5041d48018eeb3c0e13ccebf27> <http://purl.org/pav/previousVersion> <hash://sha256/1448656cc9f339b4911243d7c12f3ba5366b54fff3513640306682c50f13223d> .\n<hash://sha256/34ccd7cf7f4a1ea35ac6ae26a458bb603b2f6ee8ad36e1a58aa0261105d630b1> <http://purl.org/pav/previousVersion> <hash://sha256/7ee6b16b7a5e9b364776427d740332d8552adf5041d48018eeb3c0e13ccebf27> .<\/p>\n\nTo check the integrity of the extracted archive, confirm that each line produced by the command "preston verify" matches the form shown below, with each line including "CONTENT_PRESENT_VALID_HASH". Depending on hardware capacity, this may take a while.<\/p>\n\n$ java -jar preston.jar verify\nhash://sha256/e0c131ebf6ad2dce71ab9a10aa116dcedb219ae4539f9e5bf0e57b84f51f22ca    file:/home/preston/preston-bhl/data/e0/c1/e0c131ebf6ad2dce71ab9a10aa116dcedb219ae4539f9e5bf0e57b84f51f22ca    OK    CONTENT_PRESENT_VALID_HASH    49458087    hash://sha256/e0c131ebf6ad2dce71ab9a10aa116dcedb219ae4539f9e5bf0e57b84f51f22ca\nhash://sha256/1a57e55a780b86cff38697cf1b857751ab7b389973d35113564fe5a9a58d6a99    file:/home/preston/preston-bhl/data/1a/57/1a57e55a780b86cff38697cf1b857751ab7b389973d35113564fe5a9a58d6a99    OK    CONTENT_PRESENT_VALID_HASH    25745    hash://sha256/1a57e55a780b86cff38697cf1b857751ab7b389973d35113564fe5a9a58d6a99\nhash://sha256/85efeb84c1b9f5f45c7a106dd1b5de43a31b3248a211675441ff584a7154b61c    file:/home/preston/preston-bhl/data/85/ef/85efeb84c1b9f5f45c7a106dd1b5de43a31b3248a211675441ff584a7154b61c    OK    CONTENT_PRESENT_VALID_HASH    519892    
hash://sha256/85efeb84c1b9f5f45c7a106dd1b5de43a31b3248a211675441ff584a7154b61c\nhash://sha256/251e5032afce4f1e44bfdc5a8f0316ca1b317e8af41bdbf88163ab5bd2b52743    file:/home/preston/preston-bhl/data/25/1e/251e5032afce4f1e44bfdc5a8f0316ca1b317e8af41bdbf88163ab5bd2b52743    OK    CONTENT_PRESENT_VALID_HASH    787414    hash://sha256/251e5032afce4f1e44bfdc5a8f0316ca1b317e8af41bdbf88163ab5bd2b52743<\/p>\n\nNote that a copy of the java program "preston", preston.jar, is included in this publication. The program runs on java 8+ virtual machine using "java -jar preston.jar", or in short "preston".<\/p>\n\nFiles in this data publication:<\/p>\n\n--- start of file descriptions ---<\/p>\n\n-- description of archive and its contents (this file) --\nREADME<\/p>\n\n-- executable java jar containing preston[2] v0.1.15. --\npreston.jar<\/p>\n\n-- preston archives containing BHL data files, associated provenance logs and a provenance index --\npreston-[00-ff].tar.gz<\/p>\n\n-- individual provenance index files --\n2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a\n2b1104cb7749e818c9afca78391b2d0099bbb0a32f2b348860a335cd2f8f6800\n4081bc59dff58d63f6a86c623cb770f01e9a355a42495b205bcb538cd526190f\n47a2816f8b5600b24487093adcddfea12434cc4f270f3ab09d9215fbdd546cd2\n6f99a1388823fca745c9e22ac21e2da909a219aa1ace55170fa9248c0276903c\n7ae46d7cd9b5a0f5889ba38bac53c82e591b0bdf8b605f5e48c0dce8fb7b717f\n82903464889fea7c53f53daedf4e41fa31092f82619edeb3415eb2b473f74af3\n9e8c86243df39dd4fe82a3f814710eccf73aa9291d050415408e346fa2b09e70\na8308fbf4530e287927c471d881ce0fc852f16543d46e1ee26f1caba48815f3a\nbcec6df2ea7f74e9a6e2830d0072e6b2fbe65323d9ddb022dd6e1349c23996e2\ncfe47c25ec0210ac73c06b407beb20d9c58355cb15bae427fdc7541870ca2e4e\nf73fc9e70bce8f21f0c96b8ef0903749d8f223f71343ab5a8910968f99c9b8b6<\/p>\n\n--- end of file descriptions ---<\/p>\n\n\nReferences<\/p>\n\n[1] Biodiversity Heritage Library (BHL, https://biodiversitylibrary.org) accessed from 2019-05-19 to 2020-05-09 with 
provenance hash://sha256/34ccd7cf7f4a1ea35ac6ae26a458bb603b2f6ee8ad36e1a58aa0261105d630b1.\n[2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 .<\/p>\n\n\nThis work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.<\/p>"]} 
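The sample "preston verify" output above suggests that each data file is stored under a path derived from its own SHA-256 hash (data/e0/c1/e0c131...). The Python sketch below re-implements that check independently, as an illustration of content-addressed verification. The layout is inferred from the sample output only, and the function names and the `store` helper are hypothetical; this is not Preston's own code.

```python
import hashlib
import tempfile
from pathlib import Path

PREFIX = "hash://sha256/"

def data_path(hash_uri: str, data_dir: Path) -> Path:
    """Map a hash URI to data/<first 2 hex>/<next 2 hex>/<full hex>."""
    if not hash_uri.startswith(PREFIX):
        raise ValueError(f"unsupported identifier: {hash_uri}")
    hex_digest = hash_uri[len(PREFIX):]
    return data_dir / hex_digest[:2] / hex_digest[2:4] / hex_digest

def has_valid_hash(hash_uri: str, data_dir: Path) -> bool:
    """True when the file exists and its SHA-256 equals its own name."""
    path = data_path(hash_uri, data_dir)
    if not path.is_file():
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return PREFIX + digest.hexdigest() == hash_uri

def store(content: bytes, data_dir: Path) -> str:
    """Hypothetical helper: write bytes under their own hash for the demo."""
    hash_uri = PREFIX + hashlib.sha256(content).hexdigest()
    path = data_path(hash_uri, data_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return hash_uri

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    uri = store(b"example content", data_dir)
    intact = has_valid_hash(uri, data_dir)      # file content matches its name
    data_path(uri, data_dir).write_bytes(b"tampered")
    tampered = has_valid_hash(uri, data_dir)    # content no longer matches
    print(intact, tampered)  # True False
```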
  8. {"Abstract":["A biodiversity dataset graph: DataONE<\/p>\n\nThe intended use of this archive is to facilitate (meta-)analysis of the Data Observation Network for Earth (DataONE). DataONE is a distributed infrastructure that provides information about earth observation data.<\/p>\n\nThis dataset provides versioned snapshots of the DataONE network as tracked by Preston [2] between 2018-11-06 and 2020-05-07 using "preston update -u https://dataone.org".<\/p>\n\nThe archive consists of 256 individual parts (e.g., preston-00.tar.gz, preston-01.tar.gz, ...) to allow for parallel file downloads. The archive contains three types of files: index files, provenance logs and data files. In addition, index files have been individually included in this dataset publication to facilitate remote access. Index files provide a way to link provenance files in time to establish a versioning mechanism. Provenance files describe how, when, what and where the DataONE content was retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543 .  <\/p>\n\nTo retrieve and verify the downloaded DataONE biodiversity dataset graph, first concatenate all the downloaded preston-*.tar.gz files (e.g., cat preston-*.tar.gz > preston.tar.gz). Then, extract the archives into a "data" folder. 
Alternatively, you can use the preston[2] command-line tool to "clone" this dataset using:<\/p>\n\n$$ java -jar preston.jar clone --remote https://zenodo.org/record/3849494/files<\/p>\n\nAfter that, verify the index of the archive by reproducing the following provenance log history:<\/p>\n\n$$ java -jar preston.jar history\n<0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f> .\n<hash://sha256/3ed3acaca7ac57f546d0b8877c1927ab5e08c23eccaa8219600c59c77a72c685> <http://purl.org/pav/previousVersion> <hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f> .\n<hash://sha256/857753997a7595a1b372b05641b58a25d9408b7ff08d557ce1fe8b73e4bd383f> <http://purl.org/pav/previousVersion> <hash://sha256/3ed3acaca7ac57f546d0b8877c1927ab5e08c23eccaa8219600c59c77a72c685> .\n<hash://sha256/7ee0376f4c3f7aeeda36927a5211395e5da8201e810e8c7e638a0fe23d001e88> <http://purl.org/pav/previousVersion> <hash://sha256/857753997a7595a1b372b05641b58a25d9408b7ff08d557ce1fe8b73e4bd383f> .\n<hash://sha256/68b4974d8ab7c4c7a7a4305065839b60ba460aaa862590b34c67877738feba90> <http://purl.org/pav/previousVersion> <hash://sha256/7ee0376f4c3f7aeeda36927a5211395e5da8201e810e8c7e638a0fe23d001e88> .\n<hash://sha256/060a76d56255bf9482c951748c91291fddeeb20f180632132be1344e081b2372> <http://purl.org/pav/previousVersion> <hash://sha256/68b4974d8ab7c4c7a7a4305065839b60ba460aaa862590b34c67877738feba90> .\n<hash://sha256/29357bdfab4548025f8a5743301f5c3c9146fa436c39e3c9e019fb9409ac9c42> <http://purl.org/pav/previousVersion> <hash://sha256/060a76d56255bf9482c951748c91291fddeeb20f180632132be1344e081b2372> .\n<hash://sha256/3669cd95100d1d533eb8953ff4ec5092cbd8addb8879b3e6262191148a8a3ebb> <http://purl.org/pav/previousVersion> <hash://sha256/29357bdfab4548025f8a5743301f5c3c9146fa436c39e3c9e019fb9409ac9c42> .\n<hash://sha256/8dc1663299359d271cb1b4c14ad521d0f1be67743689dd18016543dc1e097efb> 
<http://purl.org/pav/previousVersion> <hash://sha256/3669cd95100d1d533eb8953ff4ec5092cbd8addb8879b3e6262191148a8a3ebb> .\n<hash://sha256/dc4903e8afee651db1d9bf509f20503bf9c8e89679c4bcffb46d5b97440cb6de> <http://purl.org/pav/previousVersion> <hash://sha256/8dc1663299359d271cb1b4c14ad521d0f1be67743689dd18016543dc1e097efb> .\n<hash://sha256/f3bed9db3092c744604df5f50248a2ec36e564fe78a65f45c4190283bd61c807> <http://purl.org/pav/previousVersion> <hash://sha256/dc4903e8afee651db1d9bf509f20503bf9c8e89679c4bcffb46d5b97440cb6de> .\n<hash://sha256/e3c7b3b14b2b792e3e2e560a1b2bef059ac93f777dee616b836317bc9cbfcbf7> <http://purl.org/pav/previousVersion> <hash://sha256/f3bed9db3092c744604df5f50248a2ec36e564fe78a65f45c4190283bd61c807> .\n<hash://sha256/631a4531e7bb052816d28454bbeec3428d5e7bfd1f148c4f21ce63a6cf86c650> <http://purl.org/pav/previousVersion> <hash://sha256/e3c7b3b14b2b792e3e2e560a1b2bef059ac93f777dee616b836317bc9cbfcbf7> .\n<hash://sha256/87de0898919d2212977a586965e930ae45bdd1366073591c808c208a635e2814> <http://purl.org/pav/previousVersion> <hash://sha256/631a4531e7bb052816d28454bbeec3428d5e7bfd1f148c4f21ce63a6cf86c650> .\n<hash://sha256/79ec3ee370a0d38311bc352af07a36380cd3aa04dc98154cf723bbc73d12ee77> <http://purl.org/pav/previousVersion> <hash://sha256/87de0898919d2212977a586965e930ae45bdd1366073591c808c208a635e2814> .\n<hash://sha256/e54b360a4ca84a4503e4c10a8a8cca062c130be7429c8fe6ea1e0e82fe113e12> <http://purl.org/pav/previousVersion> <hash://sha256/79ec3ee370a0d38311bc352af07a36380cd3aa04dc98154cf723bbc73d12ee77> .\n<hash://sha256/2910f784f84e112f124a56ce54bd06b76e510f90276629d2d144ce29e326d80f> <http://purl.org/pav/previousVersion> <hash://sha256/e54b360a4ca84a4503e4c10a8a8cca062c130be7429c8fe6ea1e0e82fe113e12> .\n<hash://sha256/bcb0bdff0689cfb06f586d057703e41d1c6ba409867232217081dd8cb5053c87> <http://purl.org/pav/previousVersion> <hash://sha256/2910f784f84e112f124a56ce54bd06b76e510f90276629d2d144ce29e326d80f> 
.\n<hash://sha256/a12f8c7fbf4fbfa71536c7e1b2614a35454dac6a7fe9e1cc0b4df41ab2269bef> <http://purl.org/pav/previousVersion> <hash://sha256/bcb0bdff0689cfb06f586d057703e41d1c6ba409867232217081dd8cb5053c87> .\n<hash://sha256/2b5c445f0b7b918c14a50de36e29a32854ed55f00d8639e09f58f049b85e50e3> <http://purl.org/pav/previousVersion> <hash://sha256/a12f8c7fbf4fbfa71536c7e1b2614a35454dac6a7fe9e1cc0b4df41ab2269bef> .<\/p>\n\nTo check the integrity of the extracted archive, confirm that each line produced by the command "preston verify" matches the form shown below, with each line including "CONTENT_PRESENT_VALID_HASH". Depending on hardware capacity, this may take a while.<\/p>\n\n$ java -jar preston.jar verify\nhash://sha256/e55c1034d985740926564e94decd6dc7a70f779a33e7deb931553739cda16945    file:/home/preston/preston-dataone/data/e5/5c/e55c1034d985740926564e94decd6dc7a70f779a33e7deb931553739cda16945    OK    CONTENT_PRESENT_VALID_HASH    21580    hash://sha256/e55c1034d985740926564e94decd6dc7a70f779a33e7deb931553739cda16945\nhash://sha256/d0ddcc2111b6134a570bcc7d89375920ef4d754130cecc0727c79d2b05a9f81f    file:/home/preston/preston-dataone/data/d0/dd/d0ddcc2111b6134a570bcc7d89375920ef4d754130cecc0727c79d2b05a9f81f    OK    CONTENT_PRESENT_VALID_HASH    2035    hash://sha256/d0ddcc2111b6134a570bcc7d89375920ef4d754130cecc0727c79d2b05a9f81f\nhash://sha256/472de9d1c9fd7e044aac409abfbfff9f12c6b69359df995d431009580ffb0f53    file:/home/preston/preston-dataone/data/47/2d/472de9d1c9fd7e044aac409abfbfff9f12c6b69359df995d431009580ffb0f53    OK    CONTENT_PRESENT_VALID_HASH    1935    hash://sha256/472de9d1c9fd7e044aac409abfbfff9f12c6b69359df995d431009580ffb0f53\nhash://sha256/b29879462cd43862129c5cf9b149c41ecd33ffef284a4dbea4ac1c0f90108687    file:/home/preston/preston-dataone/data/b2/98/b29879462cd43862129c5cf9b149c41ecd33ffef284a4dbea4ac1c0f90108687    OK    CONTENT_PRESENT_VALID_HASH    1553    
hash://sha256/b29879462cd43862129c5cf9b149c41ecd33ffef284a4dbea4ac1c0f90108687<\/p>\n\n\nNote that a copy of the java program "preston", preston.jar, is included in this publication. The program runs on java 8+ virtual machine using "java -jar preston.jar", or in short "preston".<\/p>\n\nFiles in this data publication:<\/p>\n\n--- start of file descriptions ---<\/p>\n\n-- description of archive and its contents (this file) --\nREADME<\/p>\n\n-- executable java jar containing preston[2] v0.1.15. --\npreston.jar<\/p>\n\n-- preston archives containing DataONE data files, associated provenance logs and a provenance index --\npreston-[00-ff].tar.gz<\/p>\n\n-- individual provenance index files --\n2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a\n2aecaf289def0e23a27058bf7715f226ef9189905f0be13228174825633125cf\n2f65ae542401d4c2daf1bca70de640211da6749188f67d28ea71acd7d8ba070b\n35eb1e17e2bf3e71212cde35bdb03e8a6545a57483ea3c1633929257b70cf637\n3d38b70198e448674be6a63d14b9817f3a956f48bba7418fa7baa086a56c05b7\n66ad3e5e904740f1e835ac6718dda4279e0c24b204ea0d1113cda1352a5072ba\n7466a35e42dea7e2be068060ec0c926f9a8686388ed504ef5c6c990c1ba4e8d0\n81161d9746c2a5823641c436e773fb4508516b055da85f4494b38c545349da39\n8bf062872ce958545d361e9d53a552ffb025ac29ab875caad1157c0995d34f66\na90eed8d70c54c8e554f2dfde4fceb434eda162d9615d62de96ded2344f88a78\nc33ef5e29100b323412f1f3bc66908c8e01e4f0d1db4ea3685d2fffc47981dd6\nc84dffef20fec958255e759db6445fc469d73695674a33ae6f7e567a088c9fe0\nd362d599d72000c4feb464db5a669b12e15fc3ca1a49b1e7d4d6f7d6d5d15411\nd9378616636be3686bbabd5bf29d50f0ef0e5ceb5ddd7dfce47f7e755b596b7d\nda26fa6e7371385ed3f61af9a766221c833060d59dfd4869bbd7110f95f288db\ne4103a75627857de3ee2e317429108611c244fc448c01d1d7bf652115c3b8a55\neb368fedb8f100210dd968edcf80f4d13cab3dd64135a6ab744102cf15e68c94\nf13ab4bca04f894ae8eabb51fa01b4dfbc69f717eabc9896c728e2ba39c4db27\nf493baf276892a199a0b0d078359f64a38fe8ad3f807921f8d41ef73f7343b1f\nff92b6c06ae5286bd2f1db679e0fcc4da294acb9bc01b
2e9522378d99218c2e3<\/p>\n\n--- end of file descriptions ---<\/p>\n\n\nReferences<\/p>\n\n[1] Data Observation Network for Earth (DataONE, https://dataone.org) accessed from 2018-11-06 to 2020-05-07 with provenance hash://sha256/2b5c445f0b7b918c14a50de36e29a32854ed55f00d8639e09f58f049b85e50e3.\n[2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 .<\/p>\n\n\nThis work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.<\/p>"]} 
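The provenance log history above is a chain of statements in which each new version points back to the one before it via http://purl.org/pav/previousVersion. A minimal Python sketch of walking such a chain to find the most recent version is given below. The regular expression, the function name, and the shortened placeholder hashes (aaa, bbb, ccc) are assumptions made for illustration; the sample uses real line breaks where the text above shows escaped ones.

```python
import re

# One statement per line: <subject> <predicate> <object> .
STATEMENT = re.compile(r"<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s+\.")
HAS_VERSION = "http://purl.org/pav/hasVersion"
PREVIOUS_VERSION = "http://purl.org/pav/previousVersion"

def latest_version(history: str) -> str:
    """Walk the version chain in order and return the newest provenance hash."""
    current = None
    for subject, predicate, obj in STATEMENT.findall(history):
        if predicate == HAS_VERSION and current is None:
            current = obj          # first version of the tracked graph
        elif predicate == PREVIOUS_VERSION and obj == current:
            current = subject      # subject supersedes the version seen so far
        else:
            raise ValueError(f"broken version chain at <{subject}>")
    if current is None:
        raise ValueError("no version statements found")
    return current

# Placeholder hashes stand in for the full 64-character values.
SAMPLE = """\
<0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/aaa> .
<hash://sha256/bbb> <http://purl.org/pav/previousVersion> <hash://sha256/aaa> .
<hash://sha256/ccc> <http://purl.org/pav/previousVersion> <hash://sha256/bbb> .
"""

print(latest_version(SAMPLE))  # hash://sha256/ccc
```

Applied to a reproduced history, the returned value would be expected to equal the provenance hash quoted in reference [1] of the corresponding README.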
  9. {"Abstract":["A biodiversity dataset graph: GBIF, iDigBio, BioCASe<\/p>\n\nThe intended use of this archive is to facilitate meta-analysis of the Global Biodiversity Information Facility, Integrated Digitized Biocollections, Biological Collection Access Service (GBIF, iDigBio, BioCASe). GBIF, iDigBio and BioCASe help provide access to biological data collections.<\/p>\n\nThis dataset provides versioned provenance logs of snapshots of the GBIF, iDigBio, BioCASe network as tracked by Preston [2] between 2018-09-03 and 2020-05-02 using "preston update -u https://gbif.org,https://idigbio.org,http://biocase.org".<\/p>\n\nThis publication contains two types of files: index files and provenance logs. Associated data files are hosted elsewhere for pragmatic reasons. Index files provide a way to link provenance files in time to establish a versioning mechanism. Provenance logs describe how, when, what and where the GBIF, iDigBio, BioCASe content was retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543 .  <\/p>\n\nTo retrieve and verify the downloaded GBIF, iDigBio, BioCASe biodiversity dataset graph, use the preston[2] command-line tool to "clone" this dataset using:<\/p>\n\n$$ java -jar preston.jar ls --remote https://zenodo.org/record/3852671/files > /dev/null<\/p>\n\nOptionally, you can retrieve all associated data (>500GB) files using:<\/p>\n\n$$ java -jar preston.jar clone --remote https://zenodo.org/record/3852671/files,https://archive.org/download/biodiversity-dataset-archives/data.zip/data/,https://deeplinker.bio<\/p>\n\nPlease note https://archive.org/download/biodiversity-dataset-archives/data.zip/data/ and https://deeplinker.bio are Preston remotes that provided access to GBIF, iDigBio, BioCASe data files at time of writing (25 May 2020). These remotes can be replaced with any other Preston remote(s) if needed. This may take a while depending on network speed and hardware constraints. 
See also https://archive.org/details/biodiversity-dataset-archives .<\/p>\n\nAfter that, verify the index of the archive by reproducing the following provenance log history:<\/p>\n\n$$ java -jar preston.jar history<\/p>\n\n<0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/c253a5311a20c2fc082bf9bac87a1ec5eb6e4e51ff936e7be20c29c8e77dee55> .\n<hash://sha256/b83cf099449dae3f633af618b19d05013953e7a1d7d97bc5ac01afd7bd9abe5d> <http://purl.org/pav/previousVersion> <hash://sha256/c253a5311a20c2fc082bf9bac87a1ec5eb6e4e51ff936e7be20c29c8e77dee55> .\n<hash://sha256/7efdea9263e57605d2d2d8b79ccd26a55743123d0c974140c72c8c1cfc679b93> <http://purl.org/pav/previousVersion> <hash://sha256/b83cf099449dae3f633af618b19d05013953e7a1d7d97bc5ac01afd7bd9abe5d> .\n<hash://sha256/05a877bdb8617144fe166a13bf51828d4ad1bc11631c360b9e648a9f7df2bbcd> <http://purl.org/pav/previousVersion> <hash://sha256/7efdea9263e57605d2d2d8b79ccd26a55743123d0c974140c72c8c1cfc679b93> .\n<hash://sha256/b5a30bbd8d51e9faf08d4ddebbc5bda9bab1b12545172f1524ac5ebdb0038bd4> <http://purl.org/pav/previousVersion> <hash://sha256/05a877bdb8617144fe166a13bf51828d4ad1bc11631c360b9e648a9f7df2bbcd> .\n<hash://sha256/1d3817d9cb9fc7de7a3b7a4181daba8de1e52b348280154e8a163c7dd7ee1a7e> <http://purl.org/pav/previousVersion> <hash://sha256/b5a30bbd8d51e9faf08d4ddebbc5bda9bab1b12545172f1524ac5ebdb0038bd4> .\n<hash://sha256/24b3f981c88c747f44ad3372095767cd15dcf81bd6cd2e54328a90a21409df43> <http://purl.org/pav/previousVersion> <hash://sha256/1d3817d9cb9fc7de7a3b7a4181daba8de1e52b348280154e8a163c7dd7ee1a7e> .\n<hash://sha256/ba02b235fd445904eae45b50bc637a195f25e9ca1637bcf26b2dc7f8698aa1fe> <http://purl.org/pav/previousVersion> <hash://sha256/24b3f981c88c747f44ad3372095767cd15dcf81bd6cd2e54328a90a21409df43> .\n<hash://sha256/102cbfb1e800ef795ba1e1c51a34bff9b463b34c9443435069ddc76970c1e9c9> <http://purl.org/pav/previousVersion> <hash://sha256/ba02b235fd445904eae45b50bc637a195f25e9ca1637bcf26b2dc7f8698aa1fe> 
.\n<hash://sha256/fd27b0552c8a6800a8b3b1b822a2063a3215c1d9887badad09a62746b80846bc> <http://purl.org/pav/previousVersion> <hash://sha256/102cbfb1e800ef795ba1e1c51a34bff9b463b34c9443435069ddc76970c1e9c9> .\n<hash://sha256/20d36a6f879ba1dd797d4288a4f2e32719d3c674156194c2765a3ec6b43f5e17> <http://purl.org/pav/previousVersion> <hash://sha256/fd27b0552c8a6800a8b3b1b822a2063a3215c1d9887badad09a62746b80846bc> .\n<hash://sha256/7801a034fe3c7920e032d2338a690b700ca41a90a92d878fc3a67111cad16d29> <http://purl.org/pav/previousVersion> <hash://sha256/20d36a6f879ba1dd797d4288a4f2e32719d3c674156194c2765a3ec6b43f5e17> .\n<hash://sha256/c1b50502b1ca87046eeb7fe4863d0cf9319b6645ff2142db69f21b4cc23332b6> <http://purl.org/pav/previousVersion> <hash://sha256/7801a034fe3c7920e032d2338a690b700ca41a90a92d878fc3a67111cad16d29> .\n<hash://sha256/dc293e26154b89273791b9674d81110029f987c686b386184d0b66a5b95f9cda> <http://purl.org/pav/previousVersion> <hash://sha256/c1b50502b1ca87046eeb7fe4863d0cf9319b6645ff2142db69f21b4cc23332b6> .\n<hash://sha256/f3ed6aa1bd15ee43d05e138b935040aaa745f6ca8c7e8f2dfbb0a3ae0df66f36> <http://purl.org/pav/previousVersion> <hash://sha256/dc293e26154b89273791b9674d81110029f987c686b386184d0b66a5b95f9cda> .\n<hash://sha256/650a28fff3e03dadba70dc05a34c580c04203380187953fa4a2fb778353fee79> <http://purl.org/pav/previousVersion> <hash://sha256/f3ed6aa1bd15ee43d05e138b935040aaa745f6ca8c7e8f2dfbb0a3ae0df66f36> .\n<hash://sha256/e4e5736e8bfec6c686eedde4c6dfa62845930d04e12dfa6f8a7d70abc3d087df> <http://purl.org/pav/previousVersion> <hash://sha256/650a28fff3e03dadba70dc05a34c580c04203380187953fa4a2fb778353fee79> .\n<hash://sha256/e69d186ff3be11830c2da67d1bfeb896ec6398fc9d555fa26eaae1baa54450fb> <http://purl.org/pav/previousVersion> <hash://sha256/e4e5736e8bfec6c686eedde4c6dfa62845930d04e12dfa6f8a7d70abc3d087df> .\n<hash://sha256/3e7f19a8a78b51437240f49c499e6e7f89b8d58d4e3ceb9480d4356721645cee> <http://purl.org/pav/previousVersion> 
<hash://sha256/e69d186ff3be11830c2da67d1bfeb896ec6398fc9d555fa26eaae1baa54450fb> .\n<hash://sha256/5c469224fa0b6159bf33a59ddaa0246634e81bddd1728e7bf3540745055eccfa> <http://purl.org/pav/previousVersion> <hash://sha256/3e7f19a8a78b51437240f49c499e6e7f89b8d58d4e3ceb9480d4356721645cee> .\n<hash://sha256/eb2c716ec85158a0785216de1b09965173fc368d12f213c1bf747bbc2e49c6a6> <http://purl.org/pav/previousVersion> <hash://sha256/5c469224fa0b6159bf33a59ddaa0246634e81bddd1728e7bf3540745055eccfa> .\n<hash://sha256/3dd674b7ad16391629948981a9cb6f6f86937d016861c3e59cd6e6bf3589f3b7> <http://purl.org/pav/previousVersion> <hash://sha256/eb2c716ec85158a0785216de1b09965173fc368d12f213c1bf747bbc2e49c6a6> .\n<hash://sha256/480868b59e95f3ce2324a7308dba65795e857d34cfbdcea7440a6f2620c6fbf6> <http://purl.org/pav/previousVersion> <hash://sha256/3dd674b7ad16391629948981a9cb6f6f86937d016861c3e59cd6e6bf3589f3b7> .\n<hash://sha256/58daa9a51e5dc0911163aa1b98d68c801106734cd29eab9980814057351aeb70> <http://purl.org/pav/previousVersion> <hash://sha256/480868b59e95f3ce2324a7308dba65795e857d34cfbdcea7440a6f2620c6fbf6> .\n<hash://sha256/a0a18b0e32f933112084b846863438038f66f63eeeb22fa9d8d734e8a25bb208> <http://purl.org/pav/previousVersion> <hash://sha256/58daa9a51e5dc0911163aa1b98d68c801106734cd29eab9980814057351aeb70> .\n<hash://sha256/a7a5e7c6a4b21bdf67f48d6bea85f438b8133f674027b04625dfadec3ff985f6> <http://purl.org/pav/previousVersion> <hash://sha256/a0a18b0e32f933112084b846863438038f66f63eeeb22fa9d8d734e8a25bb208> .\n<hash://sha256/0e6b49850d96b4b58ea3759ecea45d273a48f074c4edaaec5e008791d7718781> <http://purl.org/pav/previousVersion> <hash://sha256/a7a5e7c6a4b21bdf67f48d6bea85f438b8133f674027b04625dfadec3ff985f6> .\n<hash://sha256/8c0752dc6425b9c716837c9713ce284158b4cff70a1e66be2beb0677018831f4> <http://purl.org/pav/previousVersion> <hash://sha256/0e6b49850d96b4b58ea3759ecea45d273a48f074c4edaaec5e008791d7718781> .\n<hash://sha256/d99fa37caa268f8061980001146ed2a566e814d0740bb1974b76847512be95d3> 
<http://purl.org/pav/previousVersion> <hash://sha256/8c0752dc6425b9c716837c9713ce284158b4cff70a1e66be2beb0677018831f4> .\n<hash://sha256/af0bb2c89571a30815d4488e72dede84a2ffc102bb87961f06884509fd5d1dae> <http://purl.org/pav/previousVersion> <hash://sha256/d99fa37caa268f8061980001146ed2a566e814d0740bb1974b76847512be95d3> .\n<hash://sha256/261177a96185166f1c301beacf7350abff03d1b5710be6bfd8c4aff9caffef12> <http://purl.org/pav/previousVersion> <hash://sha256/af0bb2c89571a30815d4488e72dede84a2ffc102bb87961f06884509fd5d1dae> .\n<hash://sha256/5a39b7bbe9d1bc46ed2eb7bd76c490b5c85a09369a7cf7dc18fa04532679e9a7> <http://purl.org/pav/previousVersion> <hash://sha256/261177a96185166f1c301beacf7350abff03d1b5710be6bfd8c4aff9caffef12> .\n<hash://sha256/af8f9ed321d9c403617f54a96e3217adc918970fbbfe8b8715359669f4890b63> <http://purl.org/pav/previousVersion> <hash://sha256/5a39b7bbe9d1bc46ed2eb7bd76c490b5c85a09369a7cf7dc18fa04532679e9a7> .\n<hash://sha256/9a41d2583f0b8169ffdd44fb2d3a5e057eba4a10e5d9193d0c6e9dcf07c3119e> <http://purl.org/pav/previousVersion> <hash://sha256/af8f9ed321d9c403617f54a96e3217adc918970fbbfe8b8715359669f4890b63> .\n<hash://sha256/b9864a749112cad2fe19e62bf5d8bad580a7036d363d16d81d5c16be325fa0fd> <http://purl.org/pav/previousVersion> <hash://sha256/9a41d2583f0b8169ffdd44fb2d3a5e057eba4a10e5d9193d0c6e9dcf07c3119e> .\n<hash://sha256/09574d9c1330c2b1bec9b7bf3a55ab9273bedbfed78affd70a058a1a25e052d2> <http://purl.org/pav/previousVersion> <hash://sha256/b9864a749112cad2fe19e62bf5d8bad580a7036d363d16d81d5c16be325fa0fd> .\n<hash://sha256/668d5d6e9c9e7ddb410073ff75eb7f2935c60cc62944ba1fd96ca60feec4a103> <http://purl.org/pav/previousVersion> <hash://sha256/09574d9c1330c2b1bec9b7bf3a55ab9273bedbfed78affd70a058a1a25e052d2> .\n<hash://sha256/6387c9ebed9507a0fbba2d161e83c2da73e0d6fa6dd51fb19ac4a4ca75b839c7> <http://purl.org/pav/previousVersion> <hash://sha256/668d5d6e9c9e7ddb410073ff75eb7f2935c60cc62944ba1fd96ca60feec4a103> 
.\n<hash://sha256/d79fb9207329a2813b60713cf0968fda10721d576dcb7a36038faf18027eebc1> <http://purl.org/pav/previousVersion> <hash://sha256/6387c9ebed9507a0fbba2d161e83c2da73e0d6fa6dd51fb19ac4a4ca75b839c7> .\n<hash://sha256/6fb7271a2da1543036e39bcdb4c415a46b5437569eaaf0ffdef3e907a2f4309f> <http://purl.org/pav/previousVersion> <hash://sha256/d79fb9207329a2813b60713cf0968fda10721d576dcb7a36038faf18027eebc1> .\n<hash://sha256/ab62f4a9601f30d23353a479830f9d2dfc7898e15d2cc2d81977e898d885c908> <http://purl.org/pav/previousVersion> <hash://sha256/6fb7271a2da1543036e39bcdb4c415a46b5437569eaaf0ffdef3e907a2f4309f> .\n<hash://sha256/ff74959ec6e5e98e7db674afcb915f50725f049b968e9a9f10de169aa0a3dcb5> <http://purl.org/pav/previousVersion> <hash://sha256/ab62f4a9601f30d23353a479830f9d2dfc7898e15d2cc2d81977e898d885c908> .\n<hash://sha256/6c4c94cdb224d39e7c655b1a1a6afbba8daf3c9ac64c42ba72dfd346d5d3a547> <http://purl.org/pav/previousVersion> <hash://sha256/ff74959ec6e5e98e7db674afcb915f50725f049b968e9a9f10de169aa0a3dcb5> .\n<hash://sha256/9c17ce013b33c3c9e6bc513cb49a14660fad9bd6f87a4f21568cc871b10ba39b> <http://purl.org/pav/previousVersion> <hash://sha256/6c4c94cdb224d39e7c655b1a1a6afbba8daf3c9ac64c42ba72dfd346d5d3a547> .\n<hash://sha256/5dcf876c6cb0c5b15197acf1ea6989d41c1a1333c6a7e0437f035aa9d22a3790> <http://purl.org/pav/previousVersion> <hash://sha256/9c17ce013b33c3c9e6bc513cb49a14660fad9bd6f87a4f21568cc871b10ba39b> .\n<hash://sha256/39f83f5805f32f765003c5e9ee8c69adb3889d9f26dd61bf4aa3a829ac744e2c> <http://purl.org/pav/previousVersion> <hash://sha256/5dcf876c6cb0c5b15197acf1ea6989d41c1a1333c6a7e0437f035aa9d22a3790> .\n<hash://sha256/916255b2b73680595dcb22b30991a757dd223208473fb4fbe90405757bc07953> <http://purl.org/pav/previousVersion> <hash://sha256/39f83f5805f32f765003c5e9ee8c69adb3889d9f26dd61bf4aa3a829ac744e2c> .\n<hash://sha256/3b39831bcc286c1db44787e21b736378f5847a16b7c39bdac3dd2011e9189dc1> <http://purl.org/pav/previousVersion> 
<hash://sha256/916255b2b73680595dcb22b30991a757dd223208473fb4fbe90405757bc07953> .\n<hash://sha256/f13b15a20e4fe70b4a111e67ac20ef676404b8456dfc39694f2cb3a4c62a2b2d> <http://purl.org/pav/previousVersion> <hash://sha256/3b39831bcc286c1db44787e21b736378f5847a16b7c39bdac3dd2011e9189dc1> .\n<hash://sha256/8aacce08462b87a345d271081783bdd999663ef90099212c8831db399fc0831b> <http://purl.org/pav/previousVersion> <hash://sha256/f13b15a20e4fe70b4a111e67ac20ef676404b8456dfc39694f2cb3a4c62a2b2d> .<\/p>\n\n\nIf you retrieved data files, you can check the integrity of the extracted archive by confirming that each line produced by the command "preston verify" includes "CONTENT_PRESENT_VALID_HASH", as in the lines shown below. Depending on hardware capacity, this may take a while.<\/p>\n\n$$ java -jar preston.jar verify\nhash://sha256/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362    file:/home/preston/preston-archive/data/3e/ff/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362    OK    CONTENT_PRESENT_VALID_HASH    89931\nhash://sha256/184886cc6ae4490a49a70b6fd9a3e1dfafce433fc8e3d022c89e0b75ea3cda0b    file:/home/preston/preston-archive/data/18/48/184886cc6ae4490a49a70b6fd9a3e1dfafce433fc8e3d022c89e0b75ea3cda0b    OK    CONTENT_PRESENT_VALID_HASH    210344\nhash://sha256/1846abf2b9623697cf9b2212e019bc1f6dc4a20da51b3b5629bfb964dc808c02    file:/home/preston/preston-archive/data/18/46/1846abf2b9623697cf9b2212e019bc1f6dc4a20da51b3b5629bfb964dc808c02    OK    CONTENT_PRESENT_VALID_HASH    210344\nhash://sha256/554fdab07f2372bf363a1d7ef30fcf4c32e1da98b95a6342780c5eb35e0e7b38    file:/home/preston/preston-archive/data/55/4f/554fdab07f2372bf363a1d7ef30fcf4c32e1da98b95a6342780c5eb35e0e7b38    OK    CONTENT_PRESENT_VALID_HASH    202701<\/p>\n\nNote that a copy of the Java program "preston", preston.jar, is included in this publication. 
The program runs on a Java 8+ virtual machine using "java -jar preston.jar", or in short "preston".<\/p>\n\nFiles in this data publication:<\/p>\n\n--- start of file descriptions ---<\/p>\n\n-- description of archive and its contents (this file) --\nREADME<\/p>\n\n-- executable java jar containing preston[2] v0.1.15. --\npreston.jar<\/p>\n\n-- individual provenance index files --<\/p>\n\n049b0eb995b484c1e64184f582f51b3c608dcade70c4aefc2d53f903bae45098\n073315c32d7fd19868449bef1b11b15a86981dee53a31f7f5c882f7e3be413c3\n1172c6927e58113db668409d36b6a2cd84cf1a93e85b50d65d0bd008a5d8aaa4\n1707cb11cd9f696f1a86fd06742c1e14fad856747be88791f79f6fc7c979d5a6\n272ff1f12a573c667634d934d06b8bab0dd9cc6558795287ea99fab87620d005\n2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a\n2bbbe11bb1932c6c8fbbc2ed16dde182f53c4cecbe0dd4f779c32f527a61bc62\n37b8b636e939072d0df7246bf077ead4279f9dd33929be322e631104b0641308\n3901b6af522d535fb164823704686e72f73b7798a2a64eaeb817134552c69e2c\n395ed0c95a624f8853116442690965acf69151acd6b33cc4fc710f567828f784\n460c14ed0129c1469c9149ed1030cdc133f110fb32048748323982cb88dd7eda\n477b6c4e9ecf5c8cd1b5502e0245c8622fa4b358f6710f97db39b473ed3d8235\n52b7274f5d795e4987964bb1a327dd6d6e4f65870e6a7aac172481d0ba3013d4\n54786bde04751bc31bf38c9e89c010cfee7de91760e1f5f31218ff11acff8a70\n6135b237a49b37b857801836494f2c36bcb1526bdacf001a9d11727fff6bf1f1\n674937568c0572bc2873f502dca2fe691ba230869f0aba73f5938422654c05cc\n69b4d5ca9643c14501a48a2b1eb24971a6da68da5033c304f7f00b94e16a11d9\n6df3363a236d4f026154ef86b34d9672b111333d0c2be179c43db146864f6ed3\n70066ea7c6a9dd6c2193cdc90b3b1ff7664af235ab245f6c03d1dd497b376570\n7084702f8025c99a6608a3355ccad5ff5e644ad544121f5d524961f7fe29ceb6\n7e9934a1fc580c3f591c295306ab364c2e7a589e91590ab6334514e4b5c28062\n7ebb008412baaac3afcc8af68b796bf4ca98f367cfd61a815eee82cdffeab196\n886edb8d22973bb04fe3b42d12106029a00b9deab3fb77d8787123327b77ae3b\n8a2426eb4b38af30c6ee764463b8684e0dec400e4472a2a53e6eedf246dab178\n8a6d7e2ab026ff56380235fd969
6f5e538e5e426b9374f2ddf3a705e186a7788\n8d44c9e36a505e5c3f125e1702ef7473280bf5bcfa624fe5d3998694b67e0887\n94290680edef0f8ac81d5d4d5b8b680ba5ce821df17c4de62464429552c3360e\n95f88f27ed3448534206406738dfb5c5030fe3d6883c6dda261649357600883f\n9d12cae409e8ea0a546f7945cc629d622400000c3338e4710d9c6084fca9274d\n9fa9ea50db419c75251026708183add8973d9e68a79062f7808b110bef21006e\na24abbe089556f51fe9c2a51febdcaf893b419556312bcc63515713fc4a52922\na3b0477fe46f09b0f51c0f651691665c149bc341f5c19996675d849252e86453\na486474333f05884580dd10c54c95999063c7d1bc22e2cbe3bead604aca0a183\na524b9af3f172793998e1f9c5c0e9c949cc935624a17ed3364d32bc0391c9382\naa0e508aeb96f240b551fe92ff4224325ddcdf66f97eef95ac78aec62e53a169\nab34300942ec02cca7adf2744f6fbc1ab7587060bea09ef92b65b66f89d1ddcd\nb05d4a17d9a02180669d7eb017102dd1a739fb4615759cba94baf944b2aee29c\nb37c79f95c22fc4d657cc89dedd7a870923285da690ad4f5121962492484a142\nbc699639e5515a5fc9da9d442357cc8a9ff310a177e54f1646e002723de49f1d\nbe6d8cd5f1405a5e3e8aa492fb8dab41f6521608834d746e6cbc58d2f550f918\nc06f4413a97a5540fbdd40bdbfb194435c154533df7fe388dfdd378084e19c3d\nc585b8addfb7f7991ad74c0bae158aecefc6be5b11c28b020135e0f13040e187\nc66587e9730a6f68e961240038892df656ea99a1a25f4ff8ce556c07b09a4878\nca289dce66c8b9955c223fe3e906b8f26c12cf53506cebe651b004961f7964af\ncea1aab236de5de8da8954797d846c225bf2ad4f8fe3cd413e60ab029f9e1b3e\nda05cc27a47e755ebe912fafae434df5bd31a5d92658fe1943acc0a2023fab32\nee473aeda889fd12ac2c76aae06314e5f279cce5f1a736d39bfc097657a82060\nfcb2ee4d630a9a1440417b0c46da5bc1578a388d6aedd12189a23283b60dde7d\nfef548489bd7bea43ae1c2b7755d38a87f4a8b038a466bf7e7b4ac64d665fd62\nff32a7cbc99eaf6b67695fd94284a9b1b47a76497ef4d10ffc4dae199cc0d7c3<\/p>\n\n--- individual provenance logs 
--<\/p>\n\n05a877bdb8617144fe166a13bf51828d4ad1bc11631c360b9e648a9f7df2bbcd\n09574d9c1330c2b1bec9b7bf3a55ab9273bedbfed78affd70a058a1a25e052d2\n0e6b49850d96b4b58ea3759ecea45d273a48f074c4edaaec5e008791d7718781\n102cbfb1e800ef795ba1e1c51a34bff9b463b34c9443435069ddc76970c1e9c9\n1d3817d9cb9fc7de7a3b7a4181daba8de1e52b348280154e8a163c7dd7ee1a7e\n20d36a6f879ba1dd797d4288a4f2e32719d3c674156194c2765a3ec6b43f5e17\n24b3f981c88c747f44ad3372095767cd15dcf81bd6cd2e54328a90a21409df43\n261177a96185166f1c301beacf7350abff03d1b5710be6bfd8c4aff9caffef12\n39f83f5805f32f765003c5e9ee8c69adb3889d9f26dd61bf4aa3a829ac744e2c\n3b39831bcc286c1db44787e21b736378f5847a16b7c39bdac3dd2011e9189dc1\n3dd674b7ad16391629948981a9cb6f6f86937d016861c3e59cd6e6bf3589f3b7\n3e7f19a8a78b51437240f49c499e6e7f89b8d58d4e3ceb9480d4356721645cee\n480868b59e95f3ce2324a7308dba65795e857d34cfbdcea7440a6f2620c6fbf6\n58daa9a51e5dc0911163aa1b98d68c801106734cd29eab9980814057351aeb70\n5a39b7bbe9d1bc46ed2eb7bd76c490b5c85a09369a7cf7dc18fa04532679e9a7<\/p>"]} 
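The integrity check described in the README can be automated once the output of "preston verify" has been captured as text. The sketch below is a minimal illustration in Python, not part of preston: it assumes the output uses whitespace-separated fields as in the sample lines quoted above, and the function name failing_lines and the PLACEHOLDER_STATE value are hypothetical names introduced only for this example.

```python
VALID_STATE = "CONTENT_PRESENT_VALID_HASH"


def failing_lines(verify_output: str) -> list[str]:
    """Return non-empty lines that do not carry the valid-hash marker as a field."""
    return [
        line
        for line in verify_output.splitlines()
        if line.strip() and VALID_STATE not in line.split()
    ]


# First line of the sample output quoted in the README.
good = (
    "hash://sha256/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362"
    "    file:/home/preston/preston-archive/data/3e/ff/"
    "3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362"
    "    OK    CONTENT_PRESENT_VALID_HASH    89931"
)

# Hypothetical failing line: "FAIL" and "PLACEHOLDER_STATE" are made-up values
# for illustration, not states that preston is known to report.
bad = good.replace("OK", "FAIL").replace(VALID_STATE, "PLACEHOLDER_STATE")

print(failing_lines(good + "\n" + good))      # prints []
print(len(failing_lines(good + "\n" + bad)))  # prints 1
```

In practice the text passed to failing_lines would be the saved output of the verify command; an empty result means every line reported the valid-hash marker.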